true

Introduction

Salbutamol inhalers are used to relieve the symptoms of respiratory conditions, such as asthma and chronic obstructive pulmonary disease (NHS, 2025). Patients may use their salbutamol inhaler when they experience symptoms, such as breathlessness, tightness in their chest, coughing and wheezing (NHS, 2025). These symptoms are usually exacerbated by the cold weather (Asthma and Lung UK, 2023).

Therefore, this question aims to determine whether there is a relationship between the number of prescriptions for salbutamol inhalers (including branded salbutamol inhalers) and the proportion of households without central heating in NHS Scotland healthboards. We will look at the months which are the coldest in Scotland, the winter months, which for 2021/2022 were December, January and February.

pacman::p_load(tidyverse, janitor, here, readxl, plotly, sf, gt)

Prescription data

Loading and preparing data

#function to clean datasets
clean_prescriptions <- function(df){
  df %>% 
    drop_na() %>%
    clean_names()}

data_dec2021 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/ad9e7b46-47fb-4d42-baad-f8e98e8f5936/download/pitc202112.csv") %>%
  clean_prescriptions()

data_jan2022 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/53a53d61-3b3b-4a12-888b-a788ce13db9c/download/pitc202201.csv") %>%
  clean_prescriptions()

data_feb2022 <- read_csv("https://www.opendata.nhs.scot/dataset/84393984-14e9-4b0d-a797-b288db64d088/resource/bd7aa5c9-d708-4d0b-9b28-a9d822c84e34/download/pitc202202.csv") %>%
  clean_prescriptions()

Filtering for salbutamol and equivalents

#function to filter for salbutamol inhalers and summarise inhaler prescriptions
salbut_relievers <- c("SALBUTAMOL", "VENTOLIN", "AIROMIR", "ASMALAL", "EASYHALER", "PULVINAL", "SALAMOL", "EASI-BREATHE", "SALBULIN") #names of salbutamol and brand equivalents

salbut_presc <- function(df){
  df %>% 
  filter(str_detect(bnf_item_description,
                    paste(salbut_relievers, collapse = "|"))) %>% 
  filter(!(str_detect(bnf_item_description,
                      "FORMOTEROL|BECLOMET|BUDESONIDE|FOBUMIX"))) %>% 
  select(c(hbt, bnf_item_description, paid_quantity)) %>% #prepare dataset
  group_by(hbt) %>% 
  summarise(prescribed_salbutamol = sum(paid_quantity)) %>% #add new column for sum
  subset(hbt != "SB0806") #removing row with code SB0806
}

#filtering datasets for each winter month in 2021/2022
salbut_dec2021 <- salbut_presc(data_dec2021) %>% 
  rename("salbutamol_dec21" = prescribed_salbutamol)

salbut_jan2022 <- salbut_presc(data_jan2022) %>% 
  rename("salbutamol_jan22" = prescribed_salbutamol)

salbut_feb2022 <- salbut_presc(data_feb2022) %>% 
  rename("salbutamol_feb22" = prescribed_salbutamol)

In this code chunk, I created a function that uses regeular expressions to filter for salbutamol and its alternative brand names prescribed in the NHS (NHS, 2025). I then filtered it to remove any inhaler types that have the same brand name but a different chemical compound e.g. EASYHALER_BECLOMET 200MCG (200 D) (not salbutamol). Then I found the total of all salbutamol inhalers prescribed per healthboard.

I removed data related to the code for Scottish Ambulance Service (SB0806), as my question does not relate to this (Public Health Scotland, 2021).

Joining prescription datasets together

#join datasets for 2021/2022 together
join_prescription <- function(df1, df2){
  df1 %>% 
    full_join(df2, join_by("hbt" == "hbt"))
}
#join winter 2021/2022 data together
joined_salbut_2122 <- join_prescription(salbut_dec2021, salbut_jan2022)

joined_salbut_2122 <- join_prescription(joined_salbut_2122, salbut_feb2022)

In the code chunk above, I join all the totals of salbutamol prescriptions for each month together.

Joining healthboard names to prescription data

HB_names <- read_csv("https://www.opendata.nhs.scot/dataset/9f942fdb-e59e-44f5-b534-d6e17229cc7b/resource/652ff726-e676-4a20-abda-435b98dd7bdc/download/hb14_hb19.csv") %>% 
  clean_names()

joined_salbut_2122 %<>% full_join(HB_names, join_by("hbt" == "hb")) %>% 
  select(c(hbt, hb_name, starts_with("salbutamol"))) %>% 
  filter(!is.na(salbutamol_dec21)) #rows contain old area codes (now archived)

Next, I joined the prescription data to names of the healthboards.

Area codes from row 15 to 18 are archived healthboard codes, but which are now archived due to changes (Public Health Scotland, 2024). Therefore we can remove them.

Joining population data to prescription data

#join population data
population_data <- read_csv("https://www.opendata.nhs.scot/dataset/7f010430-6ce1-4813-b25c-f7f335bdc4dc/resource/27a72cc8-d6d8-430c-8b4f-3109a9ceadb1/download/hb2019_pop_est_03102025.csv")

population_data %<>% 
  filter(Year == "2022" & Sex == "All") %>% 
  select(c(HB, AllAges)) %>% 
  filter(HB != "S92000003") #remove country code for Scotland

salbut_population_2122 <- joined_salbut_2122 %>%
  full_join(population_data, join_by("hbt" == "HB")) %>%
  rename("population" = AllAges) 

S92000003 is the country code for Scotland, so this row tells us the population of Scotland in this year. So, we can remove it. Then I joined the population data with the prescription data.

Plotting a table to show salbutamol prescriptions

#find prescription per 100k of population
salbut_population_2122 %<>%
  rowwise() %>% 
  mutate(average_salbut = mean(c_across(starts_with("Salbutamol")))) %>% 
  mutate(average_salbut_per100k = (average_salbut*100000)/population)

#plot table
salbut_population_2122 %>% 
  select(c(hb_name, starts_with("salbut"), starts_with("average"))) %>% 
  gt() %>% 
  cols_label(hb_name = "Healthboard",
             salbutamol_dec21 = "December 2021",
             salbutamol_jan22 = "January 2022",
             salbutamol_feb22 = "February 2022",
             average_salbut = "Averages across Dec 2021 to Feb 2022",
             average_salbut_per100k = "Averages across Dec 2021 to Feb 2022 per 100k of the population") %>% 
  cols_align(align = "center", columns = !hb_name) %>% 
  fmt_number(columns = starts_with("Average"), decimals = 1) %>% 
  tab_header(title = "Prescriptions of salbutamol and branded equivalents in Winter 2021/22",
             subtitle = "Data from Scottish Healthboards") %>% 
  tab_spanner(label = "Salbutamol prescriptions", 
               columns = c(salbutamol_dec21: salbutamol_feb22))
Prescriptions of salbutamol and branded equivalents in Winter 2021/22
Data from Scottish Healthboards
Healthboard
Salbutamol prescriptions
Averages across Dec 2021 to Feb 2022 Averages across Dec 2021 to Feb 2022 per 100k of the population
December 2021 January 2022 February 2022
NHS Ayrshire and Arran 53452 42913 42147 46,170.7 12,633.9
NHS Borders 11275 9716 9412 10,134.3 8,675.2
NHS Dumfries and Galloway 29285 23135 23268 25,229.3 17,307.6
NHS Forth Valley 39882 32939 32051 34,957.3 11,544.3
NHS Grampian 50023 39070 37530 42,207.7 7,248.4
NHS Highland 33317 24375 23281 26,991.0 8,339.8
NHS Lothian 102904 81019 75596 86,506.3 9,550.3
NHS Orkney 1233 988 910 1,043.7 4,737.5
NHS Shetland 1629 1334 1166 1,376.3 5,978.9
NHS Western Isles 3984 3513 2872 3,456.3 13,232.5
NHS Fife 29045 25488 23370 25,967.7 6,992.0
NHS Tayside 38689 29882 28363 32,311.3 7,799.6
NHS Greater Glasgow and Clyde 281684 233148 219377 244,736.3 20,754.4
NHS Lanarkshire 86363 67401 67132 73,632.0 11,016.5

In this code chunk, I calculated the mean salbutamol prescriptions over the 3 months. Additionally, I calculate the number of salbutamol prescriptions per 100k of the population of each healthboard, which allows for comparisons between healthboards to be made.

The table shows the results.

Heating data 2022

Loading and preparing heating data

#load data for central heating
heating_data2022 <- read_excel(here("data", "table_2025-11-12_19-17-58.xlsx"))

#tidy dataset and change values in hb_area column to include NHS + healthboard name
heating_data2022 %<>% 
  rename(c("hb_area2019" = ...2,
           "occupied_households" = ...3,
           "no_central_heating" = ...4)) %>% 
  select(c(hb_area2019:no_central_heating)) %>%
  mutate(hb_area2019 = paste("NHS", hb_area2019)) %>% #add NHs to name of healthboard
  filter(!row_number() %in% c(1:9, 24:28)) #remove, since rows contain no data

#change data for no_central_heating and occupied_households into integers (from characters) and add a column that shows the proportion of households without central heating
heating_data2022 %<>% mutate(no_central_heating = as.numeric(no_central_heating),
         occupied_households = as.numeric(occupied_households)) %>% 
    mutate(percent_no_heating = (no_central_heating/occupied_households)*100) %>% #add column percent_no_heating 
  select(c(hb_area2019, percent_no_heating))

To find the dataset for census data regarding central heating, go to: https://www.scotlandscensus.gov.uk/webapi/jsf/tableView/tableView.xhtml

I filtered to remove any unnecessary rows in the dataset and converted the data from characters to numeric data. I then calculated the percentage of households with no central heating in each healthboard.

Visualising the percentage of no heating per healthboard

scot_hb <- st_read(here("data", "SG_NHS_HealthBoards_2019", "SG_NHS_HealthBoards_2019.shp")) %>% 
  mutate(HBName = paste("NHS", HBName)) #add NHS before healthboard name
## Reading layer `SG_NHS_HealthBoards_2019' from data source 
##   `C:\Users\Admin\OneDrive\Documents\data_science\B253518\data\SG_NHS_HealthBoards_2019\SG_NHS_HealthBoards_2019.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 14 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 5512.998 ymin: 530250.8 xmax: 470332 ymax: 1220302
## Projected CRS: OSGB36 / British National Grid
#join heating data with geographical data of healthboard boundaries
heating_map_data <- heating_data2022 %>% 
  full_join(scot_hb, join_by("hb_area2019" == "HBName"))

heating_map <- heating_map_data %>% 
  ggplot() +
  geom_sf(aes(geometry = geometry, fill = percent_no_heating)) +
  scale_fill_viridis_c(name = "% of households with no heating") +
  labs(title = "Distribution of households with no central heating",
       subtitle = "by Healthboards 2022")
heating_map

Next, I loaded the shapefile data for the Scottish healthboards. I joined the heating data for each healthboard to the shapefile. I used this to create a visualisation of the distribution of households with no central heating. Interestingly, healthboards further north, which have generally colder conditions, have a greater proportion of households with no central heating than those in the south. This may be due to the presence of a higher rural population in the south of Scotland, which may not have access to the technology required for central heating (Scottish Government, 2021).

To find the shapefiles go to: https://www.data.gov.uk/dataset/27d0fe5f-79bb-4116-aec9-a8e565ff756a/nhs-health-boards-scotland

Plotting the relationship between no heating and salbutamol prescriptions

#join average_salbutamol with heating data
final_salbutamol2122 <- salbut_population_2122 %>% 
  full_join(heating_data2022, by = (c("hb_name" = "hb_area2019"))) %>% 
  rename("healthboard" = hb_name)

#plot the relationship
interactive_plot <- final_salbutamol2122 %>%
  ggplot(aes(x = percent_no_heating,
             y = average_salbut_per100k,
             colour = healthboard)) +
  geom_point() +
  theme_minimal() +
  labs(title = "Average salbutamol prescriptions against % of households without heating",
       subtitle = "in each Scottish healthboard winter months 2021/2022",
        x = "no central heating (%)",
        y = "Salbutamol prescriptions per 100k people") +
  theme(plot.title = element_text(hjust = 0, size = 10),
        plot.subtitle = element_text(hjust = 0))

ggplotly(interactive_plot) %>% 
  layout(title = list(text = paste0("Average salbutamol prescriptions against % of households without heating",
                           "<br>",
                           "<sup>",
                           "in each Scottish healthboard winter months 2021/2022", "</sup>"
                           )))

Finally, I plotted the relationship between percentage of households with no heating and prescriptions per 100k of the population. The results were interesting, since some of the healthboards with a higher proportion of households with no heating (e.g. NHS Orkney) had fewer prescriptions of salbutamol than other areas that had lower proportions of households with no heating.This is unexpected since we know the cold exacerbates respiratory conditions (Asthma and Lung UK, 2023).

For future research, calculating statistics to test if there is a correlation between these two factors may prove beneficial in determining whether the relationship is likely to be significant. It would also be beneficial to look at other socioeconomic demographics, e.g. SIMD, which may give insight into the unusual relationship seen in the above plot.

If you hover over each point, you will be able to see more detailed descriptive statistics.

I did not use generative AI (ChatGPT etc.) to support me in this assessment.

Reference List

Asthma and Lung UK (2023). Cold Weather and Your Lungs. Available at: https://www.asthmaandlung.org.uk/living-with/cold-weather. [Accessed 1 Dec. 2025]

NHS (2025). About salbutamol inhalers. Available at: https://www.nhs.uk/medicines/salbutamol-inhaler/about-salbutamol-inhalers/ [Accessed 1 Dec. 2025].

Public Health Scotland (2021). Special Health Boards and National Facilities. Available at: https://www.opendata.nhs.scot/dataset/non-standard-geography-codes-and-labels/resource/0450a5a2-f600-4569-a9ae-5d6317141899 [Accessed 1 Dec. 2025].

Public Health Scotland (2024). Geography, population and deprivation support - Geography - NHS board or health board. Available at: https://publichealthscotland.scot/resources-and-tools/health-intelligence-and-data-management/geography-population-and-deprivation-support/geography/nhs-board-or-health-board/ [Accessed 1 Dec. 2025].

Scottish Government (2021). Rural Scotland Key Facts 2021. Available at: https://www.gov.scot/publications/rural-scotland-key-facts-2021/ [Accessed 1 Dec. 2025].